feat(experimentation): environment-scoped metrics & experiment results#7674
feat(experimentation): environment-scoped metrics & experiment results#7674gagantrivedi wants to merge 1 commit into
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. 3 Skipped Deployments
|
Docker builds report
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #7674 +/- ##
========================================
Coverage 98.52% 98.53%
========================================
Files 1444 1449 +5
Lines 55083 55276 +193
========================================
+ Hits 54273 54466 +193
Misses 810 810 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Playwright Test Results (oss - depot-ubuntu-latest-16)Details
Playwright Test Results (oss - depot-ubuntu-latest-arm-16)Details
Playwright Test Results (private-cloud - depot-ubuntu-latest-arm-16)Details
Playwright Test Results (private-cloud - depot-ubuntu-latest-16)Details
|
Visual Regression19 screenshots compared. See report for details. |
86ccf2a to
435d91f
Compare
435d91f to
9568226
Compare
9568226 to
2fea3fa
Compare
… attachment
Add a reusable, environment-scoped Metric and the ExperimentMetric join that
attaches metrics to experiments.
- Models: Metric (numeric; count/sum/mean/occurrence aggregations + JSON
definition), ExperimentMetric (expected_direction; one attach per
experiment+metric); Experiment gains exposure_event ($flag_exposure) and
control_variant.
- Metric library CRUD under environments/{key}/experiment-metrics/, gated on
EXPERIMENT_FLAG + environment admin. Metrics are immutable for now (no
update); deletion blocked while attached to an active experiment.
- Attach/detach metrics under an experiment, with same-environment and
unique-attach validation.
Results computation (ClickHouse query builder + statistics) is intentionally
kept on a separate branch; this branch is models + API only.
2fea3fa to
c961486
Compare
| expected_direction = models.CharField( | ||
| max_length=20, | ||
| choices=ExpectedDirection.choices, | ||
| ) |
There was a problem hiding this comment.
Interesting. So you see it in the relation Experiment x Metrics. I could see it too there although i'm wondering if that's a reality.
Let's say we have those metrics:
- Conversion rate (up)
- Average basket (up)
- Time to activation (down)
- First time page render (down)
Is there a world in which we'd want an experiment to push it in the other direction ? If not i'd stick it to the metrics and maybe have the possibility to override it in an experiment (in v2)
There was a problem hiding this comment.
Ok I think i'm actually mixing 2 things:
- the metric polarity (is it better up or is it better down as per what it is) -> I think we should also add this one
- the experiment impact (it should go up, it should keep it same level, it should impact it down) especially as a guardrail =>
expected_directionthat we should keep
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces Metric and ExperimentMetric models, along with corresponding API endpoints, serializers, permissions, audit logging, and unit tests to support experiment metrics. The review feedback highlights several critical improvements for robustness and business logic validation: using get_object_or_404 to handle missing or soft-deleted experiments cleanly, excluding soft-deleted experiments when checking for active metric attachments, adding defensive validation in ExperimentMetricSerializer (such as preventing modifications to completed experiments or attaching deleted metrics), and restricting the detachment of metrics from completed experiments.
| def _get_experiment(self) -> Experiment: | ||
| experiment: Experiment = Experiment.objects.get( | ||
| id=self.kwargs[self.experiment_url_kwarg], | ||
| environment__api_key=self.kwargs["environment_api_key"], | ||
| ) | ||
| return experiment |
There was a problem hiding this comment.
Using Experiment.objects.get directly in the view helper will raise Experiment.DoesNotExist if the experiment is not found, resulting in an unhandled 500 Internal Server Error. Additionally, since the soft-delete manager may include soft-deleted rows on direct PK lookups, we should explicitly ensure we only retrieve active (non-deleted) experiments. Using get_object_or_404 with deleted_at__isnull=True ensures a clean 404 Not Found response is returned.
| def _get_experiment(self) -> Experiment: | |
| experiment: Experiment = Experiment.objects.get( | |
| id=self.kwargs[self.experiment_url_kwarg], | |
| environment__api_key=self.kwargs["environment_api_key"], | |
| ) | |
| return experiment | |
| def _get_experiment(self) -> Experiment: | |
| from django.shortcuts import get_object_or_404 | |
| return get_object_or_404( | |
| Experiment, | |
| id=self.kwargs[self.experiment_url_kwarg], | |
| environment__api_key=self.kwargs["environment_api_key"], | |
| deleted_at__isnull=True, | |
| ) |
| if ( | ||
| ExperimentMetric.objects.filter(metric=instance) | ||
| .exclude(experiment__status=ExperimentStatus.COMPLETED) | ||
| .exists() | ||
| ): |
There was a problem hiding this comment.
When checking if a metric is attached to an active experiment, we must exclude soft-deleted experiments. Since ExperimentMetric is a standard model and does not automatically get deleted when an Experiment is soft-deleted, 'ghost' attachments to soft-deleted experiments will permanently block the deletion of the metric. Adding experiment__deleted_at__isnull=True prevents this issue.
if (
ExperimentMetric.objects.filter(
metric=instance,
experiment__deleted_at__isnull=True,
)
.exclude(experiment__status=ExperimentStatus.COMPLETED)
.exists()
):| def validate(self, attrs: dict[str, Any]) -> dict[str, Any]: | ||
| experiment: Experiment = self.context["experiment"] | ||
| metric: Metric = attrs.get("metric", getattr(self.instance, "metric", None)) | ||
|
|
||
| if metric.environment_id != experiment.environment_id: | ||
| raise serializers.ValidationError( | ||
| {"metric": "Metric must belong to the experiment's environment."} | ||
| ) | ||
|
|
||
| attached = experiment.experiment_metrics.all() | ||
| if isinstance(self.instance, ExperimentMetric): | ||
| attached = attached.exclude(pk=self.instance.pk) | ||
|
|
||
| if "metric" in attrs and attached.filter(metric=metric).exists(): | ||
| raise serializers.ValidationError( | ||
| {"metric": "Metric is already attached to this experiment."} | ||
| ) | ||
| return attrs |
There was a problem hiding this comment.
We should add defensive checks and business logic validation to ExperimentMetricSerializer.validate:
- Prevent attaching or modifying metrics of a completed experiment (
experiment.status == 'completed'). - Ensure
metricis notNoneto avoid potentialAttributeErroronmetric.environment_id. - Prevent attaching a soft-deleted metric (
metric.deleted_at is not None).
def validate(self, attrs: dict[str, Any]) -> dict[str, Any]:
experiment: Experiment = self.context["experiment"]
if experiment.status == "completed":
raise serializers.ValidationError(
"Cannot modify metrics of a completed experiment."
)
metric: Metric = attrs.get("metric", getattr(self.instance, "metric", None))
if not metric:
raise serializers.ValidationError({"metric": "Metric is required."})
if metric.deleted_at is not None:
raise serializers.ValidationError({"metric": "Cannot attach a deleted metric."})
if metric.environment_id != experiment.environment_id:
raise serializers.ValidationError(
{"metric": "Metric must belong to the experiment's environment."}
)
attached = experiment.experiment_metrics.all()
if isinstance(self.instance, ExperimentMetric):
attached = attached.exclude(pk=self.instance.pk)
if "metric" in attrs and attached.filter(metric=metric).exists():
raise serializers.ValidationError(
{"metric": "Metric is already attached to this experiment."}
)
return attrs| def perform_create(self, serializer: BaseSerializer[ExperimentMetric]) -> None: | ||
| serializer.save(experiment=self._get_experiment()) |
There was a problem hiding this comment.
We should prevent detaching metrics from a completed experiment. Overriding the destroy method in ExperimentMetricViewSet to check the experiment's status ensures the integrity of completed experiment results.
| def perform_create(self, serializer: BaseSerializer[ExperimentMetric]) -> None: | |
| serializer.save(experiment=self._get_experiment()) | |
| def perform_create(self, serializer: BaseSerializer[ExperimentMetric]) -> None: | |
| serializer.save(experiment=self._get_experiment()) | |
| def destroy(self, request: Request, *args: object, **kwargs: object) -> Response: | |
| experiment = self._get_experiment() | |
| if experiment.status == ExperimentStatus.COMPLETED: | |
| return Response( | |
| {"detail": "Cannot detach metrics from a completed experiment."}, | |
| status=status.HTTP_400_BAD_REQUEST, | |
| ) | |
| return super().destroy(request, *args, **kwargs) |
There was a problem hiding this comment.
Pull request overview
Adds a first-pass “metric library” for experimentation by introducing environment-scoped Metric objects and an ExperimentMetric join model, then exposing CRUD / attach / detach APIs under environment + experiment routes with auditing and permissions.
Changes:
- Add
Metric+ExperimentMetricmodels (with migration) and extend audit related-object types. - Introduce metric library endpoints (
/experiment-metrics/) and experiment metric attachment endpoints (/experiments/{id}/metrics/) with permissions, serializers, and audit logs. - Add unit tests covering metric CRUD, immutability expectations, attach/detach flows, and basic validation.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| api/tests/unit/experimentation/test_metric_views.py | Adds API tests for environment-scoped metric library CRUD and permission / flag gating. |
| api/tests/unit/experimentation/test_metric_models.py | Adds model-level tests for defaults, uniqueness, and soft-delete behavior. |
| api/tests/unit/experimentation/test_experiment_metric_views.py | Adds API tests for attaching, listing, detaching, and updating experiment-metric relationships. |
| api/experimentation/views.py | Adds MetricViewSet and ExperimentMetricViewSet and wires metric audit logging + delete-guard logic. |
| api/experimentation/services.py | Adds create_metric_audit_log and reuses existing feature-flag helpers. |
| api/experimentation/serializers.py | Adds MetricSerializer and ExperimentMetricSerializer with definition + attachment validation. |
| api/experimentation/permissions.py | Adds MetricPermission to gate metric library endpoints on experiment flag + env admin. |
| api/experimentation/models.py | Introduces MetricAggregation, ExpectedDirection, Metric, and ExperimentMetric. |
| api/experimentation/migrations/0005_metrics.py | Creates DB tables for Metric and ExperimentMetric. |
| api/experimentation/metric_urls.py | Registers the metric library router under the environment. |
| api/experimentation/experiment_urls.py | Adds nested routes for /experiments/{id}/metrics/ using nested routers. |
| api/environments/urls.py | Includes the new experiment-metrics URL module under environments. |
| api/audit/related_object_type.py | Adds METRIC related object type for audit logs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if ( | ||
| ExperimentMetric.objects.filter(metric=instance) | ||
| .exclude(experiment__status=ExperimentStatus.COMPLETED) | ||
| .exists() | ||
| ): |
| def _get_experiment(self) -> Experiment: | ||
| experiment: Experiment = Experiment.objects.get( | ||
| id=self.kwargs[self.experiment_url_kwarg], | ||
| environment__api_key=self.kwargs["environment_api_key"], | ||
| ) | ||
| return experiment |
| class ExperimentMetricSerializer(serializers.ModelSerializer): # type: ignore[type-arg] | ||
| metric = serializers.PrimaryKeyRelatedField( # type: ignore[var-annotated] | ||
| queryset=Metric.objects.all(), | ||
| ) | ||
| metric_name = serializers.CharField(source="metric.name", read_only=True) | ||
| aggregation = serializers.CharField(source="metric.aggregation", read_only=True) | ||
|
|
What
Adds a reusable, environment-scoped
Metricand wires metrics into experiments end to end, ClickHouse-native. Builds on the existingexperimentationapp (Experiment,WarehouseConnection).Data model
Metric— environment-scoped, soft-delete.metric_type(numeric/conversion),aggregation(count/sum/mean), and a JSONdefinition(the recipe: event + optional filters/value/window). Immutable for now (no update endpoint).ExperimentMetric— attaches a metric to an experiment with anexpected_direction; unique per (experiment, metric).MetricResultSnapshot— freezes computed results once an experiment completes.Experimentgainsexposure_event(default$flag_exposure) andcontrol_variant.API (gated on
EXPERIMENT_FLAG+ environment admin)…/environments/{key}/experiment-metrics/— metric library: list / create / retrieve / delete. (Notmetrics/— that path is taken by the usage-metrics viewset.) Deletion is blocked while attached to an active experiment.…/experiments/{id}/metrics/— attach / list / detach, with same-environment + unique-attach validation.…/experiments/{id}/results/— per-metric per-variantn/mean/variance, relative lift, confidence interval, and a per-metric verdict. Cached to a snapshot once completed.Results engine
query.pybuilds theassignment(argMinfirst-touch on$flag_exposure.value) +metricCTEs from a metric definition. Untrusted values are bound params;LEFT JOIN … coalesce(…,0)keeps assigned-but-inactive identities as real zeros.stats.pycompares variants with a Welch/z two-sample test (CI included; for a 0/1 conversion column this reduces to a two-proportion z-test).Scope notes (intentional cuts for v1)
query.py. A per-event id in the ingest stream is the clean long-term fix.Testing
experimentationsuite green (191 passed);mypyclean; migrations complete.🤖 Generated with Claude Code